Text/Graphics Separation Revisited
Internal identifier: 008267 (Main/Exploration); previous: 008266; next: 008268
Authors: Karl Tombre; Salvatore Tabbone; Loïc Pélissier; Bart Lamiroy; Philippe Dosch
English descriptors
document analysis; segmentation; text/graphics separation
Abstract
Text/graphics separation aims at segmenting the document into two layers: a layer assumed to contain text and a layer containing graphical objects. In this paper, we present a consolidation of a method proposed by Fletcher and Kasturi, with a number of improvements to make it more suitable for graphics-rich documents. We discuss the right choice of thresholds for this method, and their stability. We also propose a post-processing step for retrieving text components touching the graphics, through local segmentation of the distance skeleton.
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Crin, to step Corpus: 003286
- to stream Crin, to step Curation: 003286
- to stream Crin, to step Checkpoint: 000F55
- to stream Main, to step Merge: 008723
- to stream Main, to step Curation: 008267
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" wicri:score="86">Text/Graphics Separation Revisited</title>
</titleStmt>
<publicationStmt><idno type="RBID">CRIN:tombre02a</idno>
<date when="2002" year="2002">2002</date>
<idno type="wicri:Area/Crin/Corpus">003286</idno>
<idno type="wicri:Area/Crin/Curation">003286</idno>
<idno type="wicri:explorRef" wicri:stream="Crin" wicri:step="Curation">003286</idno>
<idno type="wicri:Area/Crin/Checkpoint">000F55</idno>
<idno type="wicri:explorRef" wicri:stream="Crin" wicri:step="Checkpoint">000F55</idno>
<idno type="wicri:Area/Main/Merge">008723</idno>
<idno type="wicri:Area/Main/Curation">008267</idno>
<idno type="wicri:Area/Main/Exploration">008267</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Text/Graphics Separation Revisited</title>
<author><name sortKey="Tombre, Karl" sort="Tombre, Karl" uniqKey="Tombre K" first="Karl" last="Tombre">Karl Tombre</name>
</author>
<author><name sortKey="Tabbone, Salvatore" sort="Tabbone, Salvatore" uniqKey="Tabbone S" first="Salvatore" last="Tabbone">Salvatore Tabbone</name>
</author>
<author><name sortKey="Pelissier, Loic" sort="Pelissier, Loic" uniqKey="Pelissier L" first="Loïc" last="Pélissier">Loïc Pélissier</name>
</author>
<author><name sortKey="Lamiroy, Bart" sort="Lamiroy, Bart" uniqKey="Lamiroy B" first="Bart" last="Lamiroy">Bart Lamiroy</name>
</author>
<author><name sortKey="Dosch, Philippe" sort="Dosch, Philippe" uniqKey="Dosch P" first="Philippe" last="Dosch">Philippe Dosch</name>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>document analysis</term>
<term>segmentation</term>
<term>text/graphics separation</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en" wicri:score="1686">Text/graphics separation aims at segmenting the document into two layers : a layer assumed to contain text and a layer containing graphical objects. In this paper, we present a consolidation of a method proposed by Fletcher and Kasturi, with a number of improvements to make it more suitable for graphics-rich documents. We discuss the right choice of thresholds for this method, and their stability. We also propose a post-processing step for retrieving text components touching the graphics, through local segmentation of the distance skeleton.</div>
</front>
</TEI>
<affiliations><list></list>
<tree><noCountry><name sortKey="Dosch, Philippe" sort="Dosch, Philippe" uniqKey="Dosch P" first="Philippe" last="Dosch">Philippe Dosch</name>
<name sortKey="Lamiroy, Bart" sort="Lamiroy, Bart" uniqKey="Lamiroy B" first="Bart" last="Lamiroy">Bart Lamiroy</name>
<name sortKey="Pelissier, Loic" sort="Pelissier, Loic" uniqKey="Pelissier L" first="Loïc" last="Pélissier">Loïc Pélissier</name>
<name sortKey="Tabbone, Salvatore" sort="Tabbone, Salvatore" uniqKey="Tabbone S" first="Salvatore" last="Tabbone">Salvatore Tabbone</name>
<name sortKey="Tombre, Karl" sort="Tombre, Karl" uniqKey="Tombre K" first="Karl" last="Tombre">Karl Tombre</name>
</noCountry>
</tree>
</affiliations>
</record>
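The record above can also be read programmatically. The following is a minimal sketch in Python, not part of Dilib. It assumes the record is available as a string and that the `wicri:` prefix is undeclared, as in the listing above, in which case a namespace-aware parser would reject it. The namespace URI `urn:example:wicri` and the helper names `parse_record` and `summarize` are placeholders introduced here for illustration only.

```python
import xml.etree.ElementTree as ET

# Placeholder URI (an assumption): the record shown here does not declare the
# wicri: prefix, so this value exists only to let the parser accept it.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text):
    """Parse a Wicri <record> string whose wicri: prefix is undeclared."""
    # Bind the prefix on the root element; only the first <record> is touched.
    patched = xml_text.replace("<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1)
    return ET.fromstring(patched)


def summarize(record):
    """Return title, publication year, authors and keywords from the record."""
    analytic = record.find(".//analytic")
    return {
        "title": analytic.findtext("title"),
        "year": record.findtext(".//publicationStmt/date"),
        "authors": [n.text for n in analytic.findall("author/name")],
        "keywords": [t.text for t in record.findall(".//keywords/term")],
    }


# Abridged copy of the record above (header only, two of the five authors).
SAMPLE = """<record><TEI><teiHeader><fileDesc>
<titleStmt><title xml:lang="en" wicri:score="86">Text/Graphics Separation Revisited</title></titleStmt>
<publicationStmt><idno type="RBID">CRIN:tombre02a</idno><date when="2002" year="2002">2002</date></publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Text/Graphics Separation Revisited</title>
<author><name sortKey="Tombre, Karl">Karl Tombre</name></author>
<author><name sortKey="Tabbone, Salvatore">Salvatore Tabbone</name></author>
</analytic></biblStruct></sourceDesc></fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>document analysis</term>
<term>segmentation</term><term>text/graphics separation</term></keywords></textClass></profileDesc>
</teiHeader></TEI></record>"""

if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

Binding the prefix on the root element leaves the rest of the record untouched, so the same two helpers should work on the full record as long as its root is a bare `<record>` tag.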
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 008267 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 008267 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Lorraine |area= InforLorV4 |flux= Main |étape= Exploration |type= RBID |clé= CRIN:tombre02a |texte= Text/Graphics Separation Revisited }}
This area was generated with Dilib version V0.6.33.